Abstract

We investigate the kinetics of loop formation in flexible ideal polymer chains (Rouse model),
and polymers in good and poor solvents. We show for the Rouse model, using a modification
of the theory of Szabo, Schulten, and Schulten, that the time scale for cyclization is $\tau_c \sim \tau_0 N^2$
(where $\tau_0$ is a microscopic time scale and $N$ is the number of monomers), provided the coupling
between the relaxation dynamics of the end-to-end vector and the looping dynamics is taken
into account. The resulting analytic expression fits the simulation results accurately when $a$,
the capture radius for contact formation, exceeds $b$, the average distance between two connected
beads. Simulations also show that, when $a < b$, $\tau_c \sim N^{\alpha_\tau}$, where $1.5 < \alpha_\tau < 2$ in the range
$7 < N < 200$ used in the simulations. By using a diffusion coefficient that is dependent on the
length scales $a$ and $b$ (with $a < b$), which captures the two-stage mechanism by which looping
occurs when $a < b$, we obtain an analytic expression for $\tau_c$ that fits the simulation results well.
The kinetics of contact formation between the ends of the chain are profoundly affected when
interactions between monomers are taken into account. Remarkably, for $N < 100$ the values
of $\tau_c$ decrease by more than two orders of magnitude when the solvent quality changes from
good to poor. Fits of the simulation data for $\tau_c$ to a power law in $N$ ($\tau_c \sim N^{\alpha_\tau}$) show that $\alpha_\tau$
varies from about 2.4 in a good solvent to about 1.0 in poor solvents. The effective exponent
$\alpha_\tau$ decreases as the strength of the attractive monomer-monomer interactions increases. Loop
formation in poor solvents, in which the polymer adopts dense, compact globular conformations,
occurs by a reptation-like mechanism of the ends of the chain. The time for contact formation
between beads that are interior to the chain in good solvents changes non-monotonically as loop
length varies. In contrast, the variation is monotonic in poor solvents. The implications of our
results for contact formation in polypeptide chains, RNA, and single stranded DNA are briefly
outlined.

1 Introduction 

Contact formation (cyclization) between the ends of a long polymer has been intensely studied both
experimentally [1, 2] and theoretically [3, 4, 5, 6, 7, 8, 9]. More recently, the kinetics of loop formation
has become increasingly important, largely because of its relevance to DNA looping [10, 11] as well
as protein [12, 13, 14, 15, 16, 17, 18, 19] and RNA folding [20]. The case of cyclization in DNA,
which is a measure of its intrinsic flexibility [11, 21], is important in gene expression and interactions
with proteins and RNA. In addition, the formation of contacts between residues (nucleotides) near
the loop [8] may be the key nucleating event in protein (RNA) folding. For these reasons, a number
of experiments have probed the dependence of the rates of cyclization in proteins [12, 13, 22] and
RNA [23, 24] as a function of loop length. The experimental reports, especially on rates of loop
formation in polypeptides and proteins, have prompted a number of theoretical studies [7, 25, 26]
that build on the pioneering treatments due to Wilemski and Fixman [3] (WF) and Szabo, Schulten,
and Schulten [4] (SSS). The WF formalism determines the loop closure time $\tau_c$ by solving the
diffusion equation in the presence of a sink term. The sink function accounts for the possibility that
contact between the ends of a polymer chain occurs whenever they are in proximity. The time for
forming a loop is related to a suitable time integral of the sink-sink correlation function.

In an important paper, SSS developed a much simpler theory to describe the dependence of
the rate of end-to-end contact formation in an ideal chain on the polymer length $N$. The SSS
approximation [4] describes the kinetics of contact formation between the ends of the chain as a
diffusion process in an effective potential that is derived from the probability distribution $P(R_{ee})$ of
finding the chain ends at the end-to-end distance $R_{ee}$. More recently, such an approach has been
adopted to obtain rates of folding of proteins from a free energy surface expressed in terms of an
appropriately chosen reaction coordinate [27]. The validity of using the dynamics in a potential of
mean force, $F(R_{ee}) \sim -k_B T \log[P(R_{ee})]$, to obtain $\tau_c$ hinges on local equilibrium being satisfied, i.e.
that all processes except the one of interest must occur rapidly. In the case of cyclization kinetics in
simple systems (Rouse model or self-avoiding polymer chains), the local equilibrium approximation
depends, at a minimum, on the cyclization time $\tau_c$ and the internal chain relaxation time $\tau_R$. In the limit
$\tau_c/\tau_R \gg 1$, one can envision the motions of the ends as occurring in the effective free energy $F(R_{ee})$,
because the polymer effectively explores the available volume before the ends meet. By solving the
diffusion equation for an ideal chain for which $F(R_{ee}) \sim 3k_B T R_{ee}^2/2\bar{R}_{ee}^2$, with $\bar{R}_{ee} \sim b\sqrt{N}$, where
$b$ is the monomer size, subject to absorbing boundary conditions, SSS showed that the mean first
passage time for contact formation ($\sim \tau_c$) is $\tau_{SSS} \sim \tau_0 N^{3/2}$, where $\tau_0$ is a microscopic time constant
(see below, eq. (7)).

The simplicity of the SSS result, which reduces contact-formation kinetics to merely computing
$P(R_{ee})$, has resulted in its widespread use to fit experimental data on polypeptide chains [12, 13, 22].
The dependence of $\tau_c$ on $N$ using the SSS theory differs from the WF predictions. In addition,
simulations also show that $\tau_c$ deviates from the SSS prediction [28, 29, 30, 31]. The slower dependence
of $\tau_{SSS}$ on $N$ can be traced to the failure of the assumption that all internal chain motions occur
faster than the process of interest. The interplay between $\tau_c$ and $\tau_R$, which determines the validity of
the local equilibrium condition, can be expressed in terms of well known exponents that characterize
equilibration and relaxation properties of the polymer chain. Comparison of the conformational space
explored by the chain ends and the available volume prior to cyclization [32] allows us to express the
validity of the local equilibrium in terms of $\theta = (d + g)/z$, where $d$ is the spatial dimension, $g$ is the
des Cloizeaux correlation hole exponent that accounts for the behavior of $P(R_{ee})$ for small $R_{ee}$, i.e.,
$P(R_{ee}) \sim R_{ee}^{g}$, and $z$ is the dynamical scaling exponent ($\tau_R \sim \bar{R}_{ee}^{z}$). Additional discussions along
these lines are given in Appendix A. The SSS assumption is valid only provided $\theta > 1$ [33]. For
the Rouse chain in the freely draining limit, $\nu = 1/2$, $g = 0$, $d = 3$, and $z = 4$ give $\theta < 1$, and hence $\tau_c$
will show deviations from the SSS predictions for all $N$.
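As a quick arithmetic check of the criterion quoted above, the exponents for the freely draining Rouse chain can be inserted directly. The symbol $\theta$ for the ratio $(d+g)/z$ is assumed here; only the values of $d$, $g$, and $z$ stated in the text are used.

```latex
\theta = \frac{d + g}{z} = \frac{3 + 0}{4} = \frac{3}{4} < 1 ,
```

so the local equilibrium condition fails for the Rouse chain independently of $N$.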

The purpose of this paper is twofold. (i) The theory based on the WF formalism and simulations
show that the closure time is $\tau_{WF} \sim \langle R_{ee}^2 \rangle/D_c \sim N^{1+2\nu}$ ($\nu \approx 3/5$ for a self-avoiding walk and $\nu = 1/2$ for the
Rouse chain), where $D_c$ is a diffusion constant. We show that the WF result for Rouse chains, $\tau_{WF}$,
can be obtained within the SSS framework, provided an effective diffusion constant that accounts for
the relaxation dynamics of the ends of the chains is used instead of the monomer diffusion coefficient
$D_0$. Thus, the simplicity of the SSS approach can be preserved while recovering the expected scaling
result [3, 5] for the dependence of $\tau_c$ on $N$. (ii) The use of the Rouse model may be appropriate
for polymers or polypeptide chains near $\Theta$-conditions. In both good and poor solvents, interactions
between monomers determine the statics and dynamics of the polymer chains. The chain will
swell in good solvents ($\nu \approx 3/5$), whereas in poor solvents, polymers and polypeptide chains adopt
compact globular conformations. In these situations, interactions between the monomers or the
amino acid residues affect $\tau_c$. The monomer-monomer interaction energy scale, $\epsilon_{LJ}$, leading to
the chain adopting a swollen or globular conformation influences both $\nu$ and the chain relaxation
dynamics, and hence affects $\tau_c$. Because analytic theory in this situation is difficult, we provide
simulation results for $\tau_c$ as a function of $\epsilon_{LJ}$ and for $10 < N < 100$.

2 Derivation of $\tau_{WF}$ for the Rouse Model using the SSS Approximation

The Rouse chain consists of $N$ beads, with two successive beads connected by a harmonic potential
that keeps them at an average separation $b$ (the Kuhn length). Contact formation between the
chain ends can occur only if fluctuations result in monomers 1 and $N$ being within a capture radius
$a$. In other words, the space explored by the chain ends must overlap within the contact volume
$\sim a^3$. There are three relevant time scales that affect loop closure dynamics: $\tau_0 \sim b^2/D_0$, the
fluctuation time scale of a single monomer; $\tau_{ee}$, the relaxation time associated with the fluctuations
of the end-to-end distance; and $\tau_R$, the relaxation time of the entire chain. Clearly, $\tau_{ee} < \tau_c \sim \tau_R$.
Because loop formation can occur only if the ends can approach each other, processes that occur
on the time scale $\tau_{ee}$ must be coupled to looping dynamics. We obtain the scaling of $\tau_c$ with $N$, found
using the WF approximation, from the SSS formalism using a diffusion constant evaluated on the
time scale $\tau_{ee}$.



2.1 Fluctuations in $R_{ee}$

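The Langevin equation for a Gaussian chain is [34]

$$\gamma \frac{\partial \mathbf{r}(s,t)}{\partial t} = -\frac{\delta H}{\delta \mathbf{r}(s,t)} + \boldsymbol{\eta}(s,t), \tag{1}$$

where $\boldsymbol{\eta}(s,t)$ is a white noise force with $\langle \boldsymbol{\eta}(s,t) \rangle = 0$, $\langle \boldsymbol{\eta}(s,t) \cdot \boldsymbol{\eta}(s',t') \rangle = 6\gamma k_B T \delta(t - t')\delta(s - s')$,
$\gamma$ is the friction coefficient, and $D_0 = k_B T/\gamma$ is the microscopic diffusion coefficient. By writing
$\mathbf{r}(s,t) = \mathbf{r}_0 + 2\sum_{n=1}^{\infty} \mathbf{r}_n(t) \cos(n\pi s/N)$, the Gaussian Hamiltonian $H$ becomes

$$H = \frac{3k_B T}{2b^2} \int_0^N ds \left( \frac{\partial \mathbf{r}}{\partial s} \right)^2 = \frac{3k_B T}{Nb^2} \sum_n n^2 \pi^2 \mathbf{r}_n^2(t). \tag{2}$$

The equation of motion for each mode,

$$\dot{\mathbf{r}}_n(t) = -\frac{3n^2\pi^2 D_0}{N^2 b^2} \mathbf{r}_n(t) + \boldsymbol{\eta}_n(t), \tag{3}$$

can be solved independently. The solutions naturally reveal the time scale for global motions of the
chain, $\tau_R = N^2 b^2/3D_0\pi^2 \sim N^2 b^2/D_0$. We note that $\tau_R$ is much larger than the relevant time scale
for internal motions of the monomers, $\tau_0 \approx b^2/D_0$, for large $N$. Eq. (3) can be solved directly, and
the fluctuations in the end-to-end distance $R_{ee}$ are given by

$$\langle \delta R_{ee}^2(t) \rangle = 16Nb^2 \sum_{n\ \mathrm{odd}} \frac{1}{n^2\pi^2} \left( 1 - e^{-n^2 t/\tau_R} \right), \tag{4}$$

with $\langle \delta R_{ee}^2(t) \rangle = \langle [\mathbf{R}_{ee}(t) - \mathbf{R}_{ee}(0)]^2 \rangle$. The details of the calculation leading to eq. (4) are given in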
Appendix C. If we define an effective diffusion constant using

$$D(t) = \frac{\langle \delta R_{ee}^2(t) \rangle}{6t}, \tag{5}$$

then $D(0) = 2D_0$, as is expected for the short time limit [4, 30]. On time scales on the order of $\tau_R$,
we find $D(\tau_R) \sim D_0/N$, which is identical to the diffusion constant for the center of mass of the
chain [34]. This is the expected result for the diffusion constant for global chain motion.
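The time dependence of the effective diffusion constant can be explored numerically. The sketch below is illustrative only: it assumes the standard continuum Rouse mode sum $\langle \delta R_{ee}^2(t) \rangle = 16Nb^2 \sum_{n\ \mathrm{odd}} (1 - e^{-n^2 t/\tau_R})/n^2\pi^2$ with $\tau_R = N^2b^2/3\pi^2D_0$, truncates the sum at $n < N$ (a choice made here, not taken from the text), defines $D(t) = \langle \delta R_{ee}^2(t) \rangle/6t$, and works in reduced units $b = D_0 = 1$. The function names are hypothetical.

```python
import numpy as np

def delta_R2(t, N, b=1.0, D0=1.0):
    """Mean square change of R_ee after time t from the odd-mode Rouse sum.

    Assumption: continuum Rouse spectrum, sum truncated at odd n < N.
    """
    tau_R = N**2 * b**2 / (3.0 * D0 * np.pi**2)
    n = np.arange(1, N, 2)  # odd modes 1, 3, 5, ...
    terms = (1.0 - np.exp(-n**2 * t / tau_R)) / (n**2 * np.pi**2)
    return 16.0 * N * b**2 * float(np.sum(terms))

def D_eff(t, N, b=1.0, D0=1.0):
    """Time-dependent effective diffusion constant D(t) = <dR_ee^2(t)> / 6t."""
    return delta_R2(t, N, b, D0) / (6.0 * t)

if __name__ == "__main__":
    N = 1000
    tau_R = N**2 / (3.0 * np.pi**2)
    # Long times: fluctuations saturate at 2 N b^2, so D(t) decays as 1/t.
    print(delta_R2(100.0 * tau_R, N) / (2.0 * N))
    # Intermediate times (t = N b^2 / 12 D0): compare with 8 D0 / sqrt(N pi).
    print(D_eff(N / 12.0, N) * np.sqrt(N * np.pi) / 8.0)
```

Under these assumptions both printed ratios should be close to one, which illustrates how $D(t)$ interpolates between monomer-scale and chain-scale diffusion.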

2.2 The Effective Diffusion Constant 

The theory of Szabo, Schulten, and Schulten [4] (SSS) determines the loop closure time by replacing
the difficult polymer problem, having many degrees of freedom, with a single particle diffusing in a
potential of mean force. With this approximation, $\tau_c$, which can be related to the probability that
the contact is not formed (see Appendix B for more details), becomes

$$\tau_c = \frac{1}{\mathcal{N}} \int_a^{Nb} dr\, \frac{1}{D(r)P(r)} \left( \int_r^{Nb} dr'\, P(r') \right)^2 + \frac{1}{\kappa \mathcal{N} P(a)}, \tag{6}$$

where loop closure occurs when $|\mathbf{R}_{ee}| = a$, the closure (or capture) radius, with rate $\kappa$, $P(r)$ is
the equilibrium end-to-end distribution of the chain, and $\mathcal{N} = \int_a^{Nb} dr\, P(r)$. In this paper, we will
consider only a chemically irreversible process, with the binding rate constant $\kappa \to \infty$. In the case
of the non-interacting Gaussian chain, $P(r) \sim r^2 \exp(-3r^2/2Nb^2)$. If $D(r) \sim D_0$ is a constant, it is
simple to show [4] that, for large $N$, the loop closure time is

$$\tau_{SSS} \approx \frac{1}{3} \sqrt{\frac{\pi}{6}}\, \frac{N^{3/2} b^3}{D_0 a}. \tag{7}$$

The scaling of $\tau_{SSS}$ with $N$ given in eq. (7) disagrees with other theories [3, 7] and numerous
simulations [28, 29, 30, 31] that predict $\tau_c \sim N^2$ for $Nb^2 > a^2$ and $a > b$. It has been noted [25, 33]
that the SSS theory may be a lower bound on the loop closure time for a freely draining Gaussian
chain, and that an effective diffusion coefficient that is smaller than $D_0$ is required to fit the simulated
[25] and experimental [35] data using $\tau_{SSS}$. Physically, the use of a smaller diffusion constant is
needed because contact formation requires fluctuations that bring $|\mathbf{R}_{ee}|$ within the capture radius
$a$, a mechanism in which $\tau_{ee}$ plays a crucial role.
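The large-$N$ closed form can be compared with a direct numerical evaluation of the SSS double integral. The following sketch assumes the absorbing limit ($\kappa \to \infty$), a constant $D(r) = D_0$, the unnormalized Gaussian-chain distribution $P(r) \propto r^2 \exp(-3r^2/2Nb^2)$, and the asymptotic form $\tau_{SSS} \approx \frac{1}{3}\sqrt{\pi/6}\, N^{3/2} b^3/D_0 a$. The function names and the cap on the upper integration limit are choices made here for numerical convenience.

```python
import numpy as np

def tau_sss_numeric(N, a, b=1.0, D0=1.0, n_grid=200_001):
    """Mean first passage time from the SSS double integral (kappa -> infinity)."""
    # Upper limit is the contour length N*b, capped where P(r) is already
    # negligible so that exp() cannot underflow to zero (numerical convenience).
    r_max = min(N * b, 12.0 * np.sqrt(N) * b)
    r = np.linspace(a, r_max, n_grid)
    P = r**2 * np.exp(-3.0 * r**2 / (2.0 * N * b**2))
    seg = 0.5 * (P[1:] + P[:-1]) * np.diff(r)  # trapezoid pieces of P
    # tail[i] = integral of P from r[i] to r_max
    tail = np.concatenate((np.cumsum(seg[::-1])[::-1], [0.0]))
    integrand = tail**2 / (D0 * P)
    outer = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)))
    return outer / tail[0]

def tau_sss_asymptotic(N, a, b=1.0, D0=1.0):
    """Large-N closed form, assumed to be (1/3) sqrt(pi/6) N^(3/2) b^3 / (D0 a)."""
    return np.sqrt(np.pi / 6.0) * N**1.5 * b**3 / (3.0 * D0 * a)

if __name__ == "__main__":
    for N in (100, 400):
        print(N, tau_sss_numeric(N, 1.0) / tau_sss_asymptotic(N, 1.0))
```

The ratio of the two results should approach one as $N$ grows at fixed $a/b$, since the corrections to the closed form are of relative order $a/\sqrt{N}b$.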

As noted by Doi [5], the relevant time scale for loop closure is not simply the global relaxation
time. The fluctuations in $R_{ee}$ are determined not only by the longest relaxation time, but also by
important contributions that arise from higher modes. This gives rise to the differences between the
Harmonic Spring and Rouse models [5, 29]. In the Harmonic Spring model, the chain is replaced with
only one spring which connects the two ends of the chain. The spring constant is chosen to reproduce
the end-to-end distribution function. The higher order modes give rise to excess fluctuations on a
scale $\sim 0.4\sqrt{N}b = R'$, and their inclusion is necessary to fully capture the physics of loop closure.
In the approximation of a particle diffusing in an effective potential (as in the SSS theory), this
time scale is simple to determine. If we consider only the $x$ component of $\mathbf{R}_{ee}$, we can treat it as a
particle diffusing in a potential $U_{\mathrm{eff}}(R_x) = 3R_x^2/2Nb^2 - O(1)$, with diffusion constant $D = 2D_0$. In
this case, we find

$$\langle \delta R_x^2(t) \rangle = \frac{2}{3} Nb^2 \left( 1 - e^{-t/\tau_{ee}} \right), \tag{8}$$

and $\langle \delta R_{ee}^2(t) \rangle = 3\langle \delta R_x^2(t) \rangle$, giving the natural end-to-end relaxation time $\tau_{ee} = Nb^2/6D_0$. Because
we have evaluated $\tau_{ee}$ using diffusion in an effective potential, the dependence of $\tau_{ee}$ on $N$ should
be viewed as a mean field approximation.

We can determine the effective diffusion constant on the time scale $\tau_{ee}$, which includes the
relaxation of $R_{ee}(t)$ at the mean field level. We define the effective diffusion constant as

$$D_{ee} = \lim_{t \to \tau_{ee}} \frac{\langle \delta R_{ee}^2(t) \rangle}{6t}, \tag{9}$$

with $\langle \delta R_{ee}^2(t) \rangle$ in eq. (4), which includes all of the modes of the chain, and not simply the lowest
one. Noting that $\tau_{ee}/\tau_R \sim N^{-1} \ll 1$ for large $N$, we can convert the sum in eq. (4) into an integral:

$$\langle \delta R_{ee}^2(t) \rangle \approx \frac{8}{\pi^2} Nb^2 \int_0^{\infty} dn\, \frac{1}{n^2} \left( 1 - e^{-n^2 t/\tau_R} \right). \tag{10}$$

In particular, for $t \approx \tau_{ee}/2 = Nb^2/12D_0$,

$$D_{ee} \approx \frac{8D_0}{\sqrt{N\pi}} + O(N^{-1}). \tag{12}$$



We expect these coefficients to be accurate to a constant on the order of unity. The effective diffusion
constant $D_{ee}$ takes the higher order modes of the chain into account, and should capture the essential
physics of the loop closure. In other words, on the time scale $\tau_{ee}$, resulting in $D_{ee} \sim N^{-1/2}$, the
monomers at the chain ends are within a volume $\sim a^3$, so that contact formation is possible.
Substituting $D_{ee}$ into eq. (7) gives

$$\tau_c \approx \frac{N^2 b^3 \pi}{24\sqrt{6}\, D_0 a} \sim \tau_{WF}, \tag{13}$$

in the limit of large $N$. Thus, within the SSS approximation, the $N^2$ dependence of $\tau_c$ may be
obtained, provided the effective diffusion constant $D_{ee}$ is used. The importance of using a diffusion
constant that takes relaxation dynamics of $R_{ee}$ into account has also been stressed by Portman
[25]. The closure time in eq. (13) depends on the capture radius as $a^{-1}$, which disagrees with the
$a$-independent prediction of Doi [5]. In addition, eq. (13) does not account for the possibility of
$\tau_c \sim N^{\alpha_\tau}$, with $1.5 < \alpha_\tau < 2$, as observed in simulations by Pastor et al. [28] when the capture
radius $a < b$. Both of these discrepancies are discussed in the next section, using insights garnered
from simulations.
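The substitution that leads to eq. (13) can be written out explicitly. The worked step below assumes that eq. (7) reads $\tau_{SSS} \approx \frac{1}{3}\sqrt{\pi/6}\, N^{3/2} b^3/D_0 a$ and that eq. (12) reads $D_{ee} \approx 8D_0/\sqrt{N\pi}$; it simply replaces $D_0$ by $D_{ee}$.

```latex
\tau_c \approx \left. \tau_{SSS} \right|_{D_0 \to D_{ee}}
  = \frac{1}{3} \sqrt{\frac{\pi}{6}}\, \frac{N^{3/2} b^3}{a} \cdot \frac{\sqrt{N\pi}}{8 D_0}
  = \frac{\pi N^2 b^3}{24 \sqrt{6}\, D_0\, a} .
```

The extra factor of $\sqrt{N}$ contributed by $1/D_{ee}$ is what converts the $N^{3/2}$ scaling of the original SSS result into $N^2$, while the $a^{-1}$ dependence is inherited unchanged.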
2.3 Simulations of Loop Closure Time for Freely Jointed Chains



In order to measure $R_{ee}(t)$ and $\tau_c$ for a non-interacting Freely Jointed Chain, we have performed
extensive Brownian dynamics simulations. We model the connectivity of the chain using the Hamiltonian

$$H = \frac{k_s}{2} \sum_{i=1}^{N-1} \left( 1 - \frac{|\mathbf{r}_{i+1} - \mathbf{r}_i|}{b} \right)^2, \tag{14}$$

with $b = 0.38$ nm, and a spring constant $k_s = 100$. We note that $\langle (\mathbf{r}_{i+1} - \mathbf{r}_i)^2 \rangle^{1/2} \approx 0.39$ nm for
this Hamiltonian, which we take as the Kuhn length $b$ when fitting the data. For large $N$, the
differences between the FJC and Rouse models are not relevant, and hence the scaling of $\tau_c$ with
$N$ for these two models should be identical. The microscopic diffusion coefficient was taken as
$D_0 = 0.77$ nm$^2$/ns. The equations of motion in the overdamped limit were integrated using the
Brownian dynamics algorithm [36], with a time step of $\Delta t = 10^{-4}$ ns. The end-to-end distribution
$P(r)$ is easily computed for the model in eq. (14), giving the expression for large $k_s$

$$P(r) = \frac{2r}{\pi} \int_0^{\infty} dq\, q \sin(qr) \left\{ \frac{bq\cos(bq) + k_s \sin(bq)}{bq(1 + k_s)}\, e^{-b^2 q^2/2k_s} \right\}^{N-1}, \tag{15}$$

which must be numerically integrated.

In our simulations, we computed the mean first passage time directly. We generated the initial
conditions by Monte Carlo equilibration. Starting from each equilibrated initial configuration, the
equations of motion were integrated until $|\mathbf{R}_{ee}| < a$ for the first time, with the first passage time
computed for multiple values of $N$ and $a$. The loop closure time $\tau_c$ was identified with the mean
first passage time, obtained by averaging over 400 independent trajectories. For comparison with
the analytic theory, we calculated the modified SSS first passage time, with $P(r)$ given in eq. (15),
and $D_{ee}$ given in eq. (12). The results are shown in Fig. 1. We find that the behavior of $\tau_c$ depends
strongly on the ratio $a/b$.
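The first passage protocol described above can be sketched in a few lines. This is a minimal illustration, not the simulation code used for the results: it substitutes harmonic Rouse springs ($k = 3k_BT/b^2$) for the Hamiltonian of eq. (14), draws equilibrium initial conditions directly from Gaussian bond vectors instead of Monte Carlo equilibration, uses a simple Euler-Maruyama update, and works in reduced units $b = D_0 = k_BT = 1$. The function name and default parameters are hypothetical.

```python
import numpy as np

def first_passage_time(N, a, rng, b=1.0, D0=1.0, dt=1e-3, max_steps=2_000_000):
    """Time at which |R_ee| first drops to the capture radius a.

    Assumptions: Rouse springs, overdamped Euler-Maruyama integration,
    reduced units with k_B T = 1. Returns nan if no contact within max_steps.
    """
    k = 3.0 / b**2  # spring constant in units of k_B T
    # Equilibrium start: independent Gaussian bonds with <bond^2> = b^2.
    bonds = rng.normal(scale=b / np.sqrt(3.0), size=(N - 1, 3))
    r = np.vstack((np.zeros(3), np.cumsum(bonds, axis=0)))
    for step in range(max_steps):
        if np.linalg.norm(r[-1] - r[0]) <= a:
            return step * dt
        d = r[1:] - r[:-1]          # bond vectors
        f = np.zeros_like(r)
        f[:-1] += k * d             # pull bead i toward bead i+1
        f[1:] -= k * d              # pull bead i+1 toward bead i
        noise = rng.normal(size=r.shape)
        r = r + D0 * f * dt + np.sqrt(2.0 * D0 * dt) * noise
    return float("nan")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    times = [first_passage_time(6, 1.0, rng) for _ in range(10)]
    print("estimated tau_c (reduced units):", np.mean(times))
```

Averaging over many more trajectories than in this toy run (400 in the text) is needed before the mean can be compared with the analytic expressions.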

$a > b$: For $N < 100$ and $a > b$, we find that the modified SSS theory using the effective diffusion
constant $D_{ee}$ in eq. (12) gives an excellent fit to the data, as a function of both $N$ and $a$ (Fig. 1(A)).
Thus, modeling the loop closure process as a one-dimensional diffusive process in a potential of mean
force is appropriate, so long as a diffusion coefficient that takes the dynamics of the chain ends into
account is used.

For $N > 100$ and $a > b$, we notice significant deviations in the data from the theoretical curves.
The data points appear to converge as $a$ is varied for large $N$, suggesting the emergence of Doi's
[5] predicted scaling of $\tau_c \sim N^2 a^0$. This departure from the predictions of eq. (13) suggests that
the one-dimensional mean field approximation, which gives rise to the $a$ dependence of $\tau_c$, breaks
down. Even our modified theory, which attempts to include fluctuations in $R_{ee}$ on a mean field level
leading to $D_{ee}$, cannot accurately represent the polymer as a diffusive process with a single degree of
freedom for large $N$. In this regime, the many degrees of freedom of the polymer must be explicitly
taken into account, making the WF theory [3] more appropriate.

$a < b$: The condition $a < b$ is non-physical for a Freely Jointed Chain with excluded volume,
and certainly not relevant for realistic flexible chains in which excluded volume interaction between
monomers would prevent the approach of the chain ends to distances less than $b$. (Note that for
Wormlike Chains, with the statistical segment $l_p > b$, the equivalent closure condition $a < l_p$ is
physically realistic. The effect of chain stiffness, which has been treated elsewhere [33], is beyond
the scope of this article.) In this case (Fig. 1(B)), we find $\tau_c \sim N^{\alpha_\tau}$, with $1.5 < \alpha_\tau < 2$, in
agreement with the simulation results of Pastor et al. [28]. In deriving $D_{ee}$, we assumed, as did
Doi [5], that the relaxation of the end-to-end vector is rate limiting. Once $|\mathbf{R}_{ee}| \sim R' \approx 0.4\sqrt{N}b$,
we expect the faster internal motions of the chain will search the conformational space rapidly, so
that $\tau_c$ is dominated by the slower, global motions of the chain (i.e. it is diffusion limited). This
assumption breaks down if $a \ll b$, because the endpoints must search longer for each other using the
rapid internal motions on a time scale $b^2/D_0$. In the limit of small $a$, the memory of the relaxation
of the ends of the chain is completely lost. Our derivation of $D_{ee}$, using a mean field approach,
cannot accurately describe the finer details when the endpoints search for each other over very small
length scales, and hence our theory must be modified in this regime.

We view the loop closure for small $a$ ($< b$) as a two step process (Fig. 2), with the first being
a reduction in $|\mathbf{R}_{ee}|$ to $\sim b$. The first stage is well modeled by our modified SSS theory (see Fig.
1(A)) using the effective diffusion coefficient in eq. (12). The second stage involves a search for
the two ends within a radius $b$, so that contact can occur whenever $|\mathbf{R}_{ee}| = a < b$. The large scale
relaxations of the chain are not relevant in this regime. We therefore introduce a scale-dependent
diffusion coefficient

$$D_{ee}(x) \approx \begin{cases} 8D_0/\sqrt{N\pi}, & x > b \\ 2D_0, & x \le b. \end{cases} \tag{16}$$

Substitution of eq. (16) into eq. (6) with $P(r)$ given by eq. (15) yields, for $a < b$,

$$\tau_c \approx \frac{N^2 b^2 \pi}{24\sqrt{6}\, D_0} + \frac{N^{3/2} b^2 (b - a) \sqrt{\pi}}{6\sqrt{6}\, D_0 a}. \tag{17}$$

In Fig. 1(B), we compare the predictions of eq. (17) for the closure time to the simulated data for
$a < b$. The fit is excellent, showing that the simple scale-dependent diffusion coefficient (eq. (16)),
which captures the two stage mechanism of cyclization when $a < b$, accurately describes the physics
of loop closure for small $a$. By equating the two terms in eq. (17), we predict that the $N^{3/2}$ scaling
will begin to emerge when $N \lesssim 16b^2(a/b - 1)^2/a^2\pi$. This upper bound on $N$ is consistent with the
predictions of Chen et al. [30].
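The competition between the two stages can be made concrete with a short numerical sketch. It assumes that eq. (17) reads $\tau_c \approx N^2b^2\pi/24\sqrt{6}D_0 + N^{3/2}b^2(b-a)\sqrt{\pi}/6\sqrt{6}D_0a$, which is the form obtained by inserting a diffusion coefficient of $8D_0/\sqrt{N\pi}$ for $x > b$ and $2D_0$ for $x \le b$ into the large-$N$ SSS integral. The function names are hypothetical and reduced units $b = D_0 = 1$ are used by default.

```python
import numpy as np

def two_stage_terms(N, a, b=1.0, D0=1.0):
    """Return the (N^2, N^(3/2)) contributions to the two-stage closure time."""
    stage1 = N**2 * b**2 * np.pi / (24.0 * np.sqrt(6.0) * D0)
    stage2 = (N**1.5 * b**2 * (b - a) * np.sqrt(np.pi)
              / (6.0 * np.sqrt(6.0) * D0 * a))
    return stage1, stage2

def tau_c_two_stage(N, a, b=1.0, D0=1.0):
    """Total closure time for a < b (sum of the two stages)."""
    return sum(two_stage_terms(N, a, b, D0))

def crossover_N(a, b=1.0):
    """Chain length at which the two terms are equal: 16 (b/a - 1)^2 / pi."""
    return 16.0 * (b / a - 1.0)**2 / np.pi

if __name__ == "__main__":
    a = 0.1
    print("crossover N for a/b = 0.1:", crossover_N(a))
    for N in (10, 100, 1000):
        s1, s2 = two_stage_terms(N, a)
        print(N, s1, s2)
```

For chain lengths below the crossover value the $N^{3/2}$ term dominates, which is one way to rationalize an effective exponent between 1.5 and 2 over a finite range of $N$.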

An alternate, but equivalent, description of the process of loop formation for small $a$ can also be
given. After the endpoints are within a sphere of radius $b$, chain fluctuations will drive them in and
out of the sphere many times before contact is established. This allows us to describe the search
process using an effective rate constant $\kappa_{\mathrm{eff}}$, schematically shown in Fig. 2. For small $a$, the loop
closure (a search within radius $b$) becomes effectively rate limited, as opposed to diffusion limited
[35] contact formation. The search will be successful, in the SSS formalism, on a time scale

$$\tau_{b \to a} = \frac{1}{\mathcal{N}'} \int_a^b dr\, \frac{1}{D\, P(r)} \left( \int_r^b dr'\, P(r') \right)^2, \tag{18}$$

with $\mathcal{N}' = \int_a^b dr\, P(r)$. Again, we have taken $D = 2D_0$ in this regime, because loop formation is
dominated by the fast fluctuations of the monomers, which occur on the time scale of $b^2/D_0$. For
$a \approx b$, $\tau_{b \to a} \approx (a - b)^2/6D_0$, whereas $\tau_{b \to a} \approx b^3/6aD_0$ as $a \to 0$. $\tau_{b \to a}$ can be used to define the
effective rate constant $\kappa_{\mathrm{eff}} \propto (b - a)/\tau_{b \to a}$. This can be substituted into eq. (6), and gives the
approximate loop closure time as $a \to 0$

$$\tau_c(a) - \tau_c(b) \approx \frac{1}{\kappa_{\mathrm{eff}} \mathcal{N} P(b)} \propto \frac{N^{3/2} b^3}{D_0 a}, \tag{19}$$

reproducing the same scaling for small $a$ as in eq. (17).
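The two limits quoted for the search time within the sphere of radius $b$ can be checked numerically. The sketch below assumes the SSS form $\tau_{b \to a} = \frac{1}{\mathcal{N}'} \int_a^b dr\, [D P(r)]^{-1} (\int_r^b dr' P(r'))^2$ with $D = 2D_0$, and approximates $P(r) \propto r^2$ for $r \le b \ll \sqrt{N}b$ (the Gaussian factor is taken to be one). The function name is hypothetical.

```python
import numpy as np

def tau_b_to_a(a, b=1.0, D0=1.0, n_grid=200_001):
    """Search time from radius b to contact at a, with D = 2 D0 and P(r) ~ r^2."""
    r = np.linspace(a, b, n_grid)
    P = r**2  # small-r form of the end-to-end distribution (assumption)
    seg = 0.5 * (P[1:] + P[:-1]) * np.diff(r)
    # tail[i] = integral of P from r[i] to b
    tail = np.concatenate((np.cumsum(seg[::-1])[::-1], [0.0]))
    integrand = tail**2 / (2.0 * D0 * P)
    outer = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)))
    return outer / tail[0]

if __name__ == "__main__":
    # a close to b: compare with (b - a)^2 / 6 D0
    print(tau_b_to_a(0.95) / (0.05**2 / 6.0))
    # a -> 0: compare with b^3 / 6 a D0
    print(tau_b_to_a(0.01) / (1.0 / (6.0 * 0.01)))
```

Both ratios should be close to one, with deviations of a few percent that shrink as $a \to b$ and $a \to 0$ respectively.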

The two-stage mechanism for the cyclization kinetics for $a/b < 1$ is reminiscent of the two-state
kinetic mechanism used to analyze experimental data. The parameter $\kappa_{\mathrm{eff}}$ is analogous to the
reaction limited rate [35]. If the search rate within the capture region given by $\kappa_{\mathrm{eff}}$ is small, then
we expect the exponent $\alpha_\tau < 2$. Indeed, the experiments of Buscaglia et al. suggest that $\alpha_\tau$
changes from 2 (diffusion-limited) to 1.65 (reaction-limited). Our simulation results show the same
behavior: $\alpha_\tau = 2$ for $a/b > 1$, which corresponds to a diffusion limited process, and $\alpha_\tau \approx 1.65$ for
$a/b = 0.1$, in which the search within $a/b < 1$ becomes rate limiting.



3 Loop Closure for Polymers in Good and Poor Solvents 

The kinetics of loop closure can change dramatically when interactions between monomers are taken
into account. In good solvents, in which excluded volume interactions between the monomers dominate,
it is suspected that only the scaling exponent in the dependence of $\tau_c$ on $N$ changes compared
to Rouse chains. However, relatively little is known about the kinetics of loop closure in poor
solvents, in which enthalpic effects that drive collapse of the chain dominate over chain entropy.
Because analytic work is difficult when monomer-monomer interactions become relevant, we resort
to simulations to provide insights into the loop closure dynamics.

3.1 Simulation of Cyclization Times 

The Hamiltonian used in our simulations is $H = H_{FENE} + H_{LJ}$, where

$$H_{FENE} = -\frac{k R_0^2}{2} \sum_{i=1}^{N-1} \log\left[ 1 - \frac{(|\mathbf{r}_{i+1} - \mathbf{r}_i| - b)^2}{R_0^2} \right] \tag{20}$$

models the chain connectivity, with $k = 22.2 k_B T$, and $b = 0.38$ nm. The choice $R_0 = 2b/3$ (diverging
at $|\mathbf{r}_{i+1} - \mathbf{r}_i| = b/3$ or $5b/3$) allowed for a larger timestep than using [36] $R_0 = b/2$, and increased
the efficiency of conformational sampling. The interactions between monomers are modeled using
the Lennard-Jones potential,



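$$H_{LJ} = \epsilon_{LJ} \sum_{i=1}^{N-2} \sum_{j=i+2}^{N} \left[ \left( \frac{b}{r_{ij}} \right)^{12} - 2\left( \frac{b}{r_{ij}} \right)^{6} \right], \tag{21}$$

with $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$. The Lennard-Jones interactions between the covalently bonded beads $\mathbf{r}_i$ and $\mathbf{r}_{i+1}$
are neglected to avoid excessive repulsive forces. The second virial coefficient, defining the solvent
quality, is given approximately by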



$$v_2 = \int d^3r \left[ 1 - \exp\left( -\beta H_{LJ}(r) \right) \right], \tag{22}$$

with $\beta = 1/k_B T$. In a good solvent $v_2 > 0$, while in a poor solvent $v_2 < 0$. A plot of $v_2$ as a function
of $\epsilon_{LJ}$ given in Fig. 3(A) shows that $v_2 > 0$ when $\beta\epsilon_{LJ} < 0.3$ and $v_2 < 0$ if $\beta\epsilon_{LJ} > 0.3$. In what
follows, we will refer to $\beta\epsilon_{LJ} = 0.4$ as weakly hydrophobic and $\beta\epsilon_{LJ} = 1.0$ as strongly hydrophobic.
The classification of the solvent quality based on eq. (22) is approximate. The precise determination
of the $\Theta$-point ($v_2 \approx 0$) requires the computation of $v_2$ for the entire chain. For our purposes, this
approximate demarcation between good, $\Theta$, and poor solvents based on eq. (22) suffices.
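The sign change of the second virial coefficient near $\beta\epsilon_{LJ} \approx 0.3$ can be reproduced with a short numerical integration. The sketch below is illustrative and rests on two assumptions: that the pair potential has the form $\epsilon_{LJ}[(b/r)^{12} - 2(b/r)^6]$ (minimum of depth $\epsilon_{LJ}$ at $r = b$), and that $v_2 = \int d^3r\,[1 - e^{-\beta u(r)}]$. Lengths are in units of $b$; the function names, grid, and cutoffs are choices made here.

```python
import numpy as np

def lj_reduced(x):
    """Pair energy in units of eps_LJ at separation x = r/b (assumed form)."""
    return x**-12 - 2.0 * x**-6

def second_virial(beta_eps, x_min=0.01, x_max=30.0, n_grid=300_001):
    """v_2 / b^3 = 4 pi * integral of x^2 [1 - exp(-beta u(x))] dx."""
    x = np.linspace(x_min, x_max, n_grid)
    y = 4.0 * np.pi * x**2 * (1.0 - np.exp(-beta_eps * lj_reduced(x)))
    # Inside x_min the Boltzmann factor is zero, so the integrand is 4 pi x^2.
    core = 4.0 * np.pi * x_min**3 / 3.0
    return core + float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

if __name__ == "__main__":
    for i in range(1, 11):
        beta_eps = i / 10.0
        print(f"beta*eps = {beta_eps:.1f}  v2/b^3 = {second_virial(beta_eps):+.3f}")
```

With this form of the potential, $v_2$ is positive at small $\beta\epsilon_{LJ}$ and becomes negative between 0.25 and 0.35, in line with the approximate $\Theta$-point quoted in the text.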

To fully understand the effect of solvent quality on the cyclization time, we performed Brownian
dynamics simulations for $\beta\epsilon_{LJ} = i/10$, with $1 \le i \le 10$. In our simulations, $N$ was varied from
7 to 300 for each value of $\epsilon_{LJ}$, with a fixed capture radius of $a = 2b = 0.76$ nm. The loop closure
time was identified with the mean first passage time. The dynamics for each trajectory was followed
until the two ends were within the capture radius $a$. Averaging the first passage times over 400
independent trajectories yielded the mean first passage time. The chains were initially equilibrated
using parallel tempering (replica exchange) Monte Carlo [37] to ensure proper equilibration, with
each replica pertaining to one value of $\epsilon_{LJ}$. In Fig. 3(B), we show the scaling of the radius of
gyration $\langle R_g^2 \rangle$ as a function of $N$. We find $\langle R_g^2 \rangle \sim N^{6/5}$ for the good solvent and $\langle R_g^2 \rangle \sim N$ for the
$\Theta$ solvent ($\beta\epsilon_{LJ} = 0.3$). In poor solvents ($\beta\epsilon_{LJ} > 0.3$), the large $N$ scaling of $\langle R_g^2 \rangle \sim N^{2/3}$ is not
observed for the values of $N$ used in our simulations. Similar deviations from the expected scaling
of $\langle R_g^2 \rangle$ with $N$ have been observed by Rissanou et al. [38] for short chains in a poor solvent.
Simulations using much longer chains ($N > 5000$) may be required to observe the expected scaling
exponent of 2/3.

Brownian dynamics simulations with $D_0 = 0.77$ nm$^2$/ns ($= k_B T/6\pi\eta b$, with $\eta = 1.5$ cP) were
performed to determine $\tau_c$. The loop closure time for the chains in varying solvent conditions is
shown in Fig. 4(A) and (B). The solvent quality drastically changes the loop closure time. The
values of $\tau_c$ for the good solvent ($\beta\epsilon_{LJ} = 0.1$) are nearly three orders of magnitude larger than in
the case of the strong hydrophobe ($\beta\epsilon_{LJ} = 1.0$) for $N = 80$ (Fig. 4(A)). For $N$ in the range of 20
to 30, values that are typically used in experiments on tertiary contact formation in polypeptide chains,
the value of $\tau_c$ is about 20 ns in good solvents, whereas in poor solvents $\tau_c$ is only about 0.3 ns. The
results are vividly illustrated in Fig. 4(B), which shows $\tau_c$ as a function of $\epsilon_{LJ}$ for various $N$ values.
The differences in $\tau_c$ are less pronounced as $N$ decreases (Fig. 4(B)). The absolute value of $\tau_c$ for
$N \approx 20$ is an order of magnitude less than that obtained for $\tau_c$ in polypeptides [35]. There could be two
inter-related reasons for this discrepancy. First, the value of $D_0$, an effective diffusion constant in the SSS
theory, extracted from experimental data and simulated $P(R_{ee})$ is about an order of magnitude less
than the $D_0$ in our paper. Second, Buscaglia et al. [35] used the WLC model with excluded volume
interactions, whereas our model does not take into account the effect of bending rigidity. Indeed, we
had shown in an earlier study [33] that chain stiffness increases $\tau_c$. Despite these reservations, our
values of $\tau_c$ can be made to agree better with experiments using $\eta \approx 5$ cP [9] and a slightly larger
value of $b$. Because it is not our purpose to quantitatively analyze cyclization kinetics in polypeptide
chains, we did not perform such a comparison.

We also find that the solvent quality significantly changes the scaling of $\tau_c \sim N^{\alpha_\tau}$, as shown in
Fig. 4(C). For the range of $N$ considered in our simulations, $\tau_c$ does not appear to vary as a simple
power law in $N$ (much like $\langle R_g^2 \rangle$; see Fig. 3(B)) for $\beta\epsilon_{LJ} > 0.3$. The values of $\tau_c$ in poor solvents
show increasing curvature as $N$ increases. However, if we insist that a simple power law describes
the data, then for the smaller range of $N$ from 7 to 32 (consistent with the methods of other authors
[22, 35, 16]), we can fit the initial slopes of the curves to determine an effective exponent $\alpha_\tau$ (Fig. 4(C)),
i.e. $\tau_c \approx \tau_0 N^{\alpha_\tau}$. In the absence of sound analytical theory, the extracted values of $\alpha_\tau$ should be
viewed as effective exponents. We anticipate that, much like the scaling laws for $\langle R_g^2 \rangle$, the final
large $N$ scaling exponent for $\tau_c$ will only emerge for [38] $N > 5000$, which is too large for accurate
simulations. However, with the assumption of a simple power law behavior for small $N$, we find that
the scaling exponent precipitously drops from $\alpha_\tau \approx 2.4$ in the good solvent to $\alpha_\tau \approx 1.0$ in the poor
solvent. Our estimate of $\alpha_\tau$ in good solvents is in agreement with the prediction of Debnath and
Cherayil [7] ($\alpha_\tau \approx 2.3 - 2.4$) or Thirumalai [39] ($\alpha_\tau \approx 2.4$), and is fairly close to the value obtained in
previous simulations [31] ($\alpha_\tau \approx 2.2$). The differences between the simulations may be related to the
choice of the Hamiltonian. Podtelezhnikov and Vologodskii [31] used a harmonic repulsion between
monomers to represent the impenetrability of the chain, and took $a/b < 1$ in their simulations.
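Extracting an effective exponent from the relation $\tau_c \approx \tau_0 N^{\alpha_\tau}$ amounts to a linear fit in log-log coordinates. The sketch below shows one way to do this; the function name is hypothetical, and the input used in the example is synthetic data generated from an exact power law purely to exercise the fit, not simulation output.

```python
import numpy as np

def effective_exponent(N, tau):
    """Least-squares fit of log(tau) = log(tau_0) + alpha * log(N).

    Returns (alpha, tau_0). Meaningful only over a range of N where the
    data are approximately linear on a log-log plot.
    """
    slope, intercept = np.polyfit(np.log(N), np.log(tau), 1)
    return float(slope), float(np.exp(intercept))

if __name__ == "__main__":
    # Synthetic illustration: an exact power law with alpha = 2.4, tau_0 = 0.5.
    N = np.array([7.0, 10.0, 14.0, 20.0, 25.0, 32.0])
    tau = 0.5 * N**2.4
    alpha, tau0 = effective_exponent(N, tau)
    print(alpha, tau0)
```

When the data show curvature on a log-log plot, as described above for poor solvents, the fitted slope depends on the range of $N$ chosen, which is why the resulting exponent is only an effective one.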

In contrast to the good solvent case, our estimate of $\alpha_\tau$ in poor solvents is significantly lower than
the predictions of Debnath and Cherayil [7], who suggested $\alpha_\tau \approx 1.6 - 1.7$, based on a modification
of the WF formalism [3]. However, fluorescence experiments on multiple repeats of the possibly
weakly hydrophobic glycine and serine residues in D$_2$O have found $\tau_c \sim N^{1.36}$ for short chains
[22] and $\tau_c \sim N^{1.05}$ for longer chains [16], in qualitative agreement with our simulation results.
Bending stiffness [26, 33] and hydrodynamic interactions may make direct comparison between
these experiments and our results difficult. The qualitative agreement between simulations and
experiments on polypeptide chains suggests that interactions between monomers are more important
than hydrodynamic interactions, which are screened.

3.2 Mechanisms of Loop Closure in Poor Solvents 

The dramatically smaller loop closure times in poor solvents than in good solvents (especially for
$N > 20$; see Fig. 4(B)) require an explanation. In poor solvents, the chain adopts a globular
conformation with the monomer density $\rho b^3 \sim O(1)$, where $\rho \sim N/R_g^3$. We expect the motions of
the monomers to be suppressed in the dense, compact globule. For large $N$, when entanglement
effects may dominate, it could be argued that in order for the initially spatially separated chain
ends ($|\mathbf{R}_{ee}|/a > 1$) to meet, contacts between the monomer ends and their neighbors must be
broken. Such unfavorable events might require overcoming enthalpic barriers ($\approx Q \times \epsilon_{LJ}$, where $Q$
is the average number of contacts for a bead in the interior of the globule), which would increase $\tau_c$.
Alternatively, if the ends search for each other using a diffusive, reptation-like mechanism without
having to dramatically alter the global shape of the collapsed globule, $\tau_c$ might decrease as $\epsilon_{LJ}$
increases (i.e. as the globule becomes more compact). It is then of interest to ask whether looping
events are preceded by global conformational changes, with a large scale expansion of the polymer
that allows the endpoints to search the volume more freely, or if the endpoints search for each other
in a highly compact, but more restrictive, ensemble of conformations.

In order to understand the mechanism of looping in poor solvents, we analyze in detail the end-to-end
distance $|\mathbf{R}_{ee}(t)|$ and the radius of gyration $|\mathbf{R}_g(t)|$ for two trajectories (with $\beta\epsilon_{LJ} = 1$ and
$N = 100$). One of the trajectories has a fast looping time ($\tau_c^F \approx 0.003$ ns), while the looping time
in the other is considerably slower ($\tau_c^S \approx 4.75$ ns). Additionally, we compute the time-dependent
variations of the coordination number $Q(t)$ for each endpoint. We define two monomers $i$ and $j$ to
be in 'contact' if $|\mathbf{r}_i - \mathbf{r}_j| < 1.23b$ (beyond which the interaction energy $E_{LJ} > -\epsilon_{LJ}/2$), and define
$Q_1(t)$ and $Q_N(t)$ to be the total number of monomers in contact with monomers 1 and $N$, respectively.
We do not include nearest neighbors on the backbone when computing the coordination number,
and the geometrical constraints give $0 \le Q(t) \le 11$ for either endpoint. With this definition, an
endpoint on the surface of the globule will have $Q \approx 5$. These quantities are shown in Figs. 5 and 6.
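
As an illustration of the contact definition, the coordination numbers of the two endpoints could be
computed from a single snapshot roughly as follows. This is a minimal sketch and not the code used in
the simulations: the function names, the coordinate layout (a list of 3D tuples, in units where $b = 1$
by default), and the five-bead toy snapshot are assumptions made for the example. Two monomers are
counted as in contact when they are closer than $1.23b$, and backbone nearest neighbors are skipped.

```python
import math

CONTACT_CUTOFF = 1.23  # contact distance in units of b, taken from the text


def coordination_number(positions, index, b=1.0, cutoff=CONTACT_CUTOFF):
    """Count monomers within cutoff*b of monomer `index`.

    The monomer itself and its nearest neighbors along the backbone are skipped.
    """
    count = 0
    for j, pos in enumerate(positions):
        if abs(j - index) <= 1:
            continue  # self or bonded neighbor
        if math.dist(positions[index], pos) < cutoff * b:
            count += 1
    return count


def endpoint_coordination(positions, b=1.0):
    """Return (Q_1, Q_N) for the first and last monomers of one snapshot."""
    last = len(positions) - 1
    return (coordination_number(positions, 0, b),
            coordination_number(positions, last, b))


# Hypothetical five-bead snapshot, for illustration only.
snapshot = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (5, 5, 5)]
print(endpoint_coordination(snapshot))
```

Applying `endpoint_coordination` to every saved frame of a trajectory would give time series of the
kind plotted for $Q_1(t)$ and $Q_N(t)$.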

The trajectory with $\tau_c^F$ (Fig. 5) shows little variation in either $|\mathbf{R}_g|$ or $|\mathbf{R}_{ee}|$. We find $|\mathbf{R}_{ee}| \approx |\mathbf{R}_g|$,
suggesting that the endpoints remain confined within the dense globular structure throughout the
looping process. This is also reflected in the coordination numbers for both of the endpoints, with
both $Q_1(t)$ and $Q_N(t)$ in the range $5 < Q(t) < 10$ throughout the simulation. The endpoints in
this trajectory, with the small loop closure time, always have a significant number of contacts,
and traverse the interior of the globule when searching for each other. Similarly, we also found that
the trajectory with a long first passage time $\tau_c^S$ (Fig. 6) shows little variation in $R_g$ throughout
the run. The end-to-end distance, however, shows large fluctuations over time, and $\langle R_{ee}^2 \rangle > 2\langle R_g^2 \rangle$
until closure. This suggests that, while the chain is in an overall globular conformation (small,
constant $R_g$), the endpoints are mainly found on the exterior of the globule. This conclusion is
again supported by the coordination number, with $Q(t) < 5$ for significant portions of the simulation.
While the endpoints are less restricted by nearby contacts and able to fluctuate more, they
spend a much longer time searching for each other. Thus, it appears that the process of loop formation
in poor solvents, where enthalpic effects might be expected to dominate for $N = 100$, occurs by a
diffusive, reptation-like process. Entanglement effects are not significant in our simulations.

We note that trajectories in which the first passage time for looping is rapid (with $\tau_{ci} < \tau_c$ for
trajectory $i$) have at least one endpoint with a high coordination number ($Q > 5$) throughout the
simulation. In contrast, for most slow-looping runs (with $\tau_{ci} > \tau_c$), we observe long stretches of time
where both endpoints have a low coordination number ($Q < 5$). These results suggest that motions
within the globule are far less restricted than one might have thought, and that loop formation occurs
faster when the endpoints are within the globule than when they are on the surface.
The longer values of $\tau_c$ are found if the initial separation of the endpoints is large, which is more
likely if they are on the surface than buried in the interior. The absence of any change in $|\mathbf{R}_g(t)|$ in
both the trajectories, which represent the extreme limits in the first passage time for looping, clearly
shows that contact formation in the globular phase is not an activated process. Thus, we surmise
that looping in poor solvents occurs by a diffusive, reptation-like mechanism, provided entanglement
effects are negligible.

3.3 Separating the Equilibrium Distribution $P(R_{ee})$ and Diffusive Processes in Looping Dynamics

The results in the previous section suggest a very general mechanism of loop closure for interacting
chains. The process of contact formation for a given trajectory depends on the initial separation $R_{ee}$,
and the dynamics of the approach of the ends. Thus, $\tau_c$ should be determined by the distribution
$P(R_{ee})$ (an equilibrium property), and an effective diffusion coefficient $D(t)$ (a dynamic property).
We have shown for the Rouse model that such a deconvolution into an equilibrium and a dynamic
part, which is in the spirit of the SSS approximation, is accurate in obtaining $\tau_c$ for a wide range of
$N$ and $a/b$. It turns out that a similar approach is applicable to interacting chains as well.

The decomposition of looping mechanisms into a convolution of equilibrium and dynamical parts
explains the large differences in $\tau_c$ as the solvent quality changes. We find, in fact, that the equilibrium
behavior of the endpoints dominates the process of loop formation, with the kinetic processes
being only weakly dependent on the solvent quality for short chains. In Fig. 7(A), we plot the end-to-end
distribution function for weakly ($\beta\epsilon_{LJ} = 0.4$) and strongly ($\beta\epsilon_{LJ} = 1$) hydrophobic polymer
chains. The strongly hydrophobic chain is highly compact, with a sharply peaked distribution. The
average end-to-end distance is significantly lower than in the weakly hydrophobic case. While the
distribution function is clearly strongly dependent on the interactions, the diffusion coefficient $D(t)$
is only weakly dependent on the solvent quality (Fig. 7(B)). The values of $D(t) = \langle \delta R_{ee}^2(t) \rangle / 6t$ are
only reduced by a factor of about 2 between the $\beta\epsilon_{LJ} = 0.1$ (good solvent, with a globally swollen
configuration) and the $\beta\epsilon_{LJ} = 1.0$ (poor solvent, with a globular configuration) cases on intermediate
time scales. We note, in fact, that the good solvent and $\Theta$ solvent cases have virtually identical
diffusion coefficients throughout the simulations (Fig. 7(B)). This suggests that the increase in $\tau_c$
(Fig. 4) between the Rouse chains and the good solvent chains is primarily due to the broadening of
the distribution $P(R_{ee})$, i.e., the significant increase in the average end-to-end distance in the good
solvent case, $\langle R_{ee}^2 \rangle \sim N^{2\nu}$, with $\nu = 3/5$.
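
To make the definition $D(t) = \langle \delta R_{ee}^2(t) \rangle / 6t$ concrete, the estimator could be sketched as below.
This is only an illustration under stated assumptions: $\delta \mathbf{R}_{ee}(t)$ is taken to be $\mathbf{R}_{ee}(t) - \mathbf{R}_{ee}(0)$, the
average runs over independent trajectories, and the function name and input layout (one list of
end-to-end vectors per trajectory, sampled every `dt`) are hypothetical.

```python
def end_to_end_diffusion(trajectories, dt):
    """Estimate D(t) = <|R_ee(t) - R_ee(0)|^2> / (6 t).

    trajectories: list of trajectories, each a list of (x, y, z) end-to-end
                  vectors sampled at a fixed time step dt.
    Returns a list of (t, D(t)) pairs for t > 0.
    """
    n_frames = min(len(traj) for traj in trajectories)
    result = []
    for k in range(1, n_frames):
        # Mean squared displacement of the end-to-end vector after k steps,
        # averaged over trajectories.
        msd = 0.0
        for traj in trajectories:
            msd += sum((now - start) ** 2
                       for now, start in zip(traj[k], traj[0]))
        msd /= len(trajectories)
        t = k * dt
        result.append((t, msd / (6.0 * t)))
    return result
```

Averaging over time origins within each trajectory as well would reduce the statistical noise, at the
cost of correlated samples; the sketch keeps only the simplest average.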

Because of the weak dependence of the diffusion coefficient on the solvent quality, the loop
closure time is dominated primarily by the end-to-end distribution function. In other words, the
equilibrium distribution function $P(R_{ee})$, to a large extent, determines $\tau_c$. To further illustrate these
arguments, we find that if we take $D \approx 2D_0$ in eq. (6) and numerically integrate the distribution
function found in the simulations for $N = 100$, $\tau_c(\beta\epsilon_{LJ} = 1.0)$ and $\tau_c(\beta\epsilon_{LJ} = 0.4)$ differ by two
orders of magnitude, almost completely accounting for the large differences seen in Fig. 4(B) between
the two cases. Moreover, if the numerically computed values of $D(t)$ for long $t$ ($t > 0.5$ ns in Fig. 7, for
example) are used for $D_{ee}$ in eq. (6), we obtain values of $\tau_c$ that are in reasonably good agreement with
simulations. The use of $D_{ee}$ ensures that the dynamics of the entire chain is explicitly taken into
account. These observations rationalize the use of $P(R_{ee})$ with a suitable choice of $D_{ee}$ in obtaining
accurate results for flexible as well as stiff chains [33, 40]. Because $P(R_{ee})$ can, in principle, be
inferred from FRET experiments [41, 42], the theory outlined here can be used to quantitatively
predict loop formation times. In addition, FRET experiments can also be used to assess the utility
of polymer models in describing fluctuations in single stranded nucleic acids and polypeptide chains.
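
The numerical integration described in this section can be sketched as follows. The sketch assumes
that eq. (6) has the standard SSS form for an absorbing boundary at the capture radius $a$ and a
constant diffusion coefficient, $\tau = \frac{1}{\mathcal{N}} \int_a^L dr\, \frac{1}{D P(r)} \left[ \int_r^L dr'\, P(r') \right]^2$ with $\mathcal{N} = \int_a^L dr\, P(r)$;
this form, the function name, and the tabulated-histogram input are assumptions, since eq. (6) itself is
not reproduced here. $P(r)$ is the radial distribution (including the $4\pi r^2$ factor) and need not be
normalized.

```python
def sss_closure_time(r, p, diffusion):
    """Mean first passage time from a tabulated distribution P(r).

    r:         increasing grid from the capture radius a to the largest separation
    p:         values of P(r) on that grid (must be > 0; normalization is irrelevant)
    diffusion: constant effective diffusion coefficient D
    """
    n = len(r)
    # tail[i] = integral of P from r[i] to r[-1], accumulated with the trapezoid rule.
    tail = [0.0] * n
    for i in range(n - 2, -1, -1):
        tail[i] = tail[i + 1] + 0.5 * (p[i] + p[i + 1]) * (r[i + 1] - r[i])
    norm = tail[0]
    # Outer integrand: tail(r)^2 / (D * P(r)).
    integrand = [tail[i] ** 2 / (diffusion * p[i]) for i in range(n)]
    outer = sum(0.5 * (integrand[i] + integrand[i + 1]) * (r[i + 1] - r[i])
                for i in range(n - 1))
    return outer / norm
```

As a sanity check, a flat $P(r)$ on $[0, L]$ reproduces the textbook one-dimensional result
$\tau = L^2/3D$. In practice the histogram bins with zero counts would have to be dropped or smoothed
before calling the function.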

3.4 Kinetics of interior loop formation 

We computed the kinetics of contact between beads that are in the chain interior as a function of
solvent quality (Fig. 8(A)) using $N = 32$. The mean time for making a contact is computed using
the same procedure as used for cyclization kinetics. For simplicity, we only consider interior points
that are equidistant (along the chain contour) from the chain ends. The ratio $r_l$, which measures
the change in the time for interior loop formation relative to cyclization kinetics, depends on $\beta\epsilon_{LJ}$
and $l/N$, where $l$ is the separation between the beads (Fig. 8(A)). In good solvents, $r_l$ depends
non-monotonically on $l$, and $r_l \approx 1$ once $l/N$ decreases to about 0.6. The maximum
in $r_l$ at $l/N \approx 0.9$ decreases as $\beta\epsilon_{LJ}$ increases. In the poorest solvents considered ($\beta\epsilon_{LJ} = 0.8$), we
observe that $r_l$ only decreases monotonically with decreasing $l/N$. Interestingly, in poor solvents, $r_l$
can be much less than unity, which implies that it is easier to establish contacts between beads in
the chain interior than between the ends. This prediction can be verified in polypeptide chains in
the presence of inert crowding agents that should decrease the solvent quality. Just as in cyclization
kinetics, interior loop formation also depends on the interplay between internal chain diffusion, which
gets slower as the solvent quality decreases, and the equilibrium distribution of the distance between
the contacting beads, which gets narrower.

We also performed simulations for $N = 80$ by first computing the time for cyclization $\tau_c^{80}$. In
another set of simulations, two flexible linkers, each containing 20 beads, were attached to the ends
of the $N = 80$ chain. For the resulting longer chain, we calculated $\tau_l$ for $l = 80$ as a function of $\beta\epsilon_{LJ}$.
Such a calculation is relevant in the context of single molecule experiments in which the properties
of a biomolecule (RNA) are inferred by attaching linkers with varying polymer characteristics. It is
important to choose linker characteristics that minimally affect the dynamic properties of the
molecule of interest. The ratio $\tau_{l=80}/\tau_c^{80}$ depends on $\beta\epsilon_{LJ}$ and changes from 2.6 (good solvents) to
2.0 under $\Theta$ conditions and becomes unity in poor solvents (Fig. 8(B)). Analysis of the dependence
of the diffusion coefficients of the interior-to-interior vector $D_{ij}$ ($i = 20$ and $j = 100$) and the end-to-end
vector (of the original chain without linkers) $D_{ee}$ on solvent conditions indicates that, on the time scales
relevant to loop closure (analogous to $\tau_{ee}$ for the Rouse chain), $D_{ij}$ reduces to about one half
of $D_{ee}$ in good and $\Theta$ solvents, whereas the two are very similar in poor solvents. The changes in
the diffusion coefficient together with the equilibrium distance distribution explain the behavior in
Fig. 8(B).

4 Conclusions 

A theoretical description of contact formation between the chain endpoints is difficult because of the
many-body nature of the dynamics of a polymer. Even for the simple case of cyclization kinetics
in Rouse chains, accurate results for $\tau_c$ are difficult to obtain for all values of $N$, $a$, and $b$. The
present work confirms that, for large $N$ and $a/b > 1$, the looping time must scale as $N^2$, a result
that was obtained some time ago using the WF formalism [3, 5]. Here, we have derived $\tau_c \sim N^2$
(for $N \gg 1$ and $a > b$) by including the full internal chain dynamics within the simple and elegant
SSS theory [4]. We have shown that, for $N < 100$ and especially in the (unphysical) limit $a/b < 1$,
the loop closure time $\tau_c \sim \tau_0 N^{\alpha_\tau}$ with $1.5 < \alpha_\tau < 2$. In this limit, our simulations show that
loop closure occurs in two stages with vastly differing time scales. By incorporating these processes
into a scale-dependent diffusion coefficient, we obtain an expression for $\tau_c$ that accurately fits the
simulation data. The resulting expression for $\tau_c$ for $a < b$ (eq. (17)) contains both the $N^{3/2}$ and $N^2$
limits, as was suggested by Pastor et al. [28].

The values of $\tau_c$ for all $N$ change dramatically when interactions between monomers are taken
into account. In good solvents, $\tau_c \sim \tau_0 N^{\alpha_\tau}$ ($\alpha_\tau \approx 2.4$) in the range of $N$ used in the simulations.
Our exponent $\alpha_\tau$ is in reasonable agreement with earlier theoretical estimates [39, 7]. Polypeptide
chains in high denaturant concentrations may be modeled as flexible chains in good solvents. From
this perspective, the simple scaling law can be used to fit the experimental data on loop formation
in the presence of denaturants using physical values of $\tau_0$. Only when $N$ is relatively small ($N \approx 4$)
will chain stiffness play a role in controlling loop closure times. Indeed, experiments show that $\tau_c$
increases for short $N$ (see Fig. 3 in Ref. 15), and deviates from the power law behavior given in eq.
(7) for all $N$, which is surely due to the importance of bending rigidity.
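
An effective exponent of this kind is typically extracted by a straight-line fit in log-log
coordinates. The following is a minimal sketch of such a fit, not the fitting procedure actually used
for the simulation data: the function name is hypothetical, and the numbers in the usage lines are
synthetic values generated from an exact power law purely to exercise the code.

```python
import math


def fit_power_law(chain_lengths, closure_times):
    """Least-squares fit of log(tau_c) = log(prefactor) + alpha * log(N).

    Returns (alpha, prefactor).
    """
    xs = [math.log(n) for n in chain_lengths]
    ys = [math.log(t) for t in closure_times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    prefactor = math.exp(mean_y - alpha * mean_x)
    return alpha, prefactor


# Synthetic data (not simulation results): tau_c = 0.5 * N**2.4 exactly.
lengths = [10, 20, 40, 80]
times = [0.5 * n ** 2.4 for n in lengths]
alpha, tau0 = fit_power_law(lengths, times)
```

Because the fit is restricted to a finite window of $N$, the resulting $\alpha_\tau$ is an effective exponent
and may drift if the window is changed.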

The simulation results for $\tau_c$ in poor solvents show rich behavior that reflects the extent to which
the quality of the solvent is poor. The poorness of the solvent can be expressed in terms of

$$\Delta = \frac{\epsilon_{LJ} - \epsilon_{LJ}(\Theta)}{\epsilon_{LJ}(\Theta)}, \qquad (23)$$

where the $\Theta$-solvent interaction strength $\beta\epsilon_{LJ}(\Theta) \approx 0.3$ is determined from $v_2 \approx 0$ (Fig. 3). Loop
closure times decrease dramatically as $\Delta$ increases. For example, $\tau_c$ decreases by a factor of about
100 for $N = 80$ as $\Delta$ increases from 0 to 2.3. In this range of $N$, a power law fit of $\tau_c$ with $N$
($\tau_c \sim N^{\alpha_\tau}$) shows that the exponent $\alpha_\tau$ depends on $\Delta$. Analysis of the trajectories that monitor loop
closure shows that contact between the ends of the chain is established by mutual, reptation-like
motion within the dense, compact globular phase.
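
As a quick numerical check of the range quoted for the poorness parameter, the short sketch below
assumes that $\Delta$ is the fractional excess of the interaction strength over its $\Theta$-solvent value,
$\Delta = (\epsilon_{LJ} - \epsilon_{LJ}(\Theta)) / \epsilon_{LJ}(\Theta)$, with $\beta\epsilon_{LJ}(\Theta) \approx 0.3$. The functional form is inferred from the
quoted range and the function name is hypothetical.

```python
def solvent_poorness(beta_eps_lj, beta_eps_theta=0.3):
    """Fractional distance of the interaction strength from the Theta point.

    Both arguments are in units of k_B T; the default Theta value is the
    approximate one quoted in the text.
    """
    return (beta_eps_lj - beta_eps_theta) / beta_eps_theta


# Poorest solvent in the N-dependence study (beta * eps_LJ = 1.0) and the Theta point.
delta_poor = solvent_poorness(1.0)
delta_theta = solvent_poorness(0.3)
```

With these inputs `delta_theta` is zero and `delta_poor` is about 2.3, matching the end points of the
range over which $\tau_c$ drops by roughly two orders of magnitude for $N = 80$.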

The large variations of $\tau_c$ as $\Delta$ changes suggest that there ought to be significant dependence
of loop formation rates on the sequence in polypeptide chains. In particular, our results suggest
that as the number of hydrophobic residues increases, $\tau_c$ should decrease. Similarly, as the number
of charged or polar residues increases, the effective persistence length ($l_p$) and interactions can be
altered, which in turn could increase $\tau_c$. Larger variations in $\tau_c$, due to its dependence on $l_p$ and $N$,
can be achieved most easily in single-stranded RNA and DNA. These arguments neglect sequence
effects, which are also likely to be important. The results in Fig. 4(B) may also be reminiscent of
"hydrophobic collapse" in proteins, especially as $\Delta$ becomes large. For large $\Delta$ and long $N$, it is likely
that $\tau_c$ correlates well with time scales for collapse. This scenario is already reflected in $P(R_{ee})$ (see
Fig. 7(A)). It may be possible to test the predictions in Fig. 4(B) by varying the solvent quality
for polypeptides. A combination of denaturants (which make the solvent quality good) and PEG (which
makes it poor) can be used to measure $\tau_c$ in polypeptide chains. We expect that the measured $\tau_c$ should be
qualitatively similar to the findings in Fig. 4(B).

The physics of loop closure for small and intermediate chain lengths ($N < 300$) is rather complicated,
due to contributions from various time and length scales (global relaxation and internal
motions of the chains). The contributions from these sources are often comparable, making the
process of looping dynamics difficult to describe theoretically. A clear picture of the physics is obtained
only when one considers all possible ranges of the parameters entering the loop closure time
equation. To this end, we have explored wide ranges of conceivable parameters, namely the chain
length $N$, the capture radius $a$, and the solvent conditions expressed in terms of $\epsilon_{LJ}$. By combining
analytic theory and simulations, we have shown that, for a given $N$, the looping dynamics in all
solvent conditions is primarily determined by the initial separation of the endpoints. The many-body
nature of the diffusive process is embodied in $D(t)$, which does not vary significantly as $\Delta$
changes for a fixed $N$. Finally, the dramatic change in $\tau_c$ as $\Delta$ increases suggests that it may also be
necessary to include hydrodynamic interactions, which may decrease $\tau_c$ further, to obtain the loop
closure times more accurately.

5 Appendix A 

Friedman and O'Shaughnessy [43] (FO) generalized the concept of the exploration of space suggested
by de Gennes [44] to the cyclization reaction of polymer chains. The arguments given by de Gennes
and FO succinctly reveal the conditions under which local equilibrium is appropriate in terms of
properties of the polymer chains.

First, de Gennes introduced the notion of compact and non-compact exploration of space associated
with a bimolecular reaction involving polymers. Tertiary contact formation is a particular
example of such a process. Consider the relative position between two reactants on a lattice with
lattice spacing $a$. The two reactants explore the available conformational space until their relative
distance becomes less than the reaction radius. One can define two quantities relevant to the volume
spanned prior to the reaction. One comes from the actual number of jumps on the lattice, defined
as $j(t)$, which is directly proportional to $t$. If the jumps are performed on a $d$-dimensional lattice,
the actual volume explored would be $a^d j(t)$. The other quantity comes from the root mean square
distance. If $x(t) \sim t^u$ is the root mean square distance in one dimension, $x^d(t)$ is the net volume
explored. The comparison between these two volumes defines the compactness in the exploration of
the space. (i) The case $x^d(t) > a^d j(t)$ corresponds to non-compact exploration of the space ($ud > 1$).
(ii) The regime $x^d(t) < a^d j(t)$ represents compact exploration of the space ($ud < 1$). Depending on
the dimensionality, the exploration of space by the reactive pair in the bimolecular reaction is categorized
either as non-compact ($d = 3$) or as compact ($d = 1$) exploration. In the case of non-compact
exploration, the bimolecular reaction takes place infrequently, so that local equilibrium in solution
is easily reached. The reaction rate is simply proportional to the probability that the reactive
pair is within the reaction radius, so that $k \sim p_{eq}(r < r_0)$, which eventually leads to $k = 4\pi a D$, the
well-known steady state diffusion controlled rate coefficient. It can be shown that $k \sim t^{-1}$ in the case
of compact exploration.

In the context of polymer cyclization, the compactness of the exploration of space can be assessed
using the exponent $\theta = (d + g)/z$, where $g$ is the correlation hole exponent and $z$ is the dynamic
exponent. Since [45, 46] $\lim_{r \to 0} p_{eq}(r) \sim \frac{1}{R^d} \left( \frac{r}{R} \right)^g$ and the cyclization rate can be approximated by
$k \sim \frac{1}{t} \int d^d r\, p_{eq}(r)$, it follows that $k \sim \frac{1}{t} \left( \frac{r}{R} \right)^{d+g}$. The relations $r \sim t^{1/z}$ and $R \sim \tau^{1/z}$ lead to
$k \sim \frac{1}{t} \left( \frac{t}{\tau} \right)^{(d+g)/z}$, where $\tau$ is the characteristic relaxation time.

1. If $\theta > 1$, then the cyclization rate is given by $k \sim p_{eq}(r = r_0) \sim \frac{1}{R^d} \left( \frac{r_0}{R} \right)^g$, which, with $R \sim N^\nu$,
leads to the scaling relation

$$\tau_c \sim N^{\nu(d+g)}. \qquad (24)$$

2. If $\theta < 1$, compact exploration of conformations occurs between the chain ends. As a result,
the internal modes are not in local equilibrium. In this case, $\tau_c \sim \tau_R \sim R^z$, where $z = 2 + 1/\nu$
is the dynamic exponent for the free-draining case and $z = d$ when hydrodynamic interactions are
included [45, 34]. Therefore, the scaling law for the cyclization time is given by

$$\tau_c \sim N^{z\nu}. \qquad (25)$$

The inference about the validity of local equilibrium, based on $\theta$, is extremely useful in obtaining the
scaling laws for polymer cyclization, eqs. (24) and (25). Extensive Brownian dynamics simulations
by Rey et al. [47] have established the validity of these scaling laws. The expected scaling laws for
three different polymer models are discussed below.

• Free-draining Gaussian chain ($d = 3$, $g = 0$, $z = 4$, $\nu = 1/2$): $\theta = 3/4 < 1$.

Because $\theta < 1$, the local equilibrium approximation is not valid for a "long" free-draining
Gaussian chain, or equivalently the Rouse model. Accordingly, we expect $\tau_c \sim N^2$ for the
Rouse chain for $N \gg 1$. However, if $N$ is small and local equilibrium is established among
the internal Rouse modes so that $\tau_c \gg \tau_R$, the scaling relation changes from $\tau_c \sim N^2$ to
$\tau_c \sim N^{\alpha_\tau}$, with $\alpha_\tau < 2$. The simulations shown here and elsewhere [28], and the theory by
Sokolov [6], explicitly demonstrate that $\alpha_\tau$ can be less than 2 for small $N$. In this sense, the looping
time of a free-draining Gaussian chain of finite size is bounded by [25, 33] $\tau_{SSS} < \tau_c < \tau_{WF}$.

• Free-draining Gaussian chain with excluded volume ($d = 3$, $g = (\gamma - 1)/\nu = 5/18$, $z = 11/3$, $\nu = 3/5$): $\theta = 59/66 < 1$.

From eq. (25), it follows that $\tau_c \sim N^{2.2}$. This polymer model has been extensively studied using
Brownian dynamics simulations, and the value of the scaling exponent 2.2 has been confirmed
by Vologodskii [31]. The value of the exponent (2.2) is also consistent with previous theoretical
predictions [7, 39].

• Gaussian chain with excluded volume and hydrodynamic interactions ($d = 3$, $g = 5/18$, $z = 3$,
$\nu = 3/5$): $\theta = 59/54 > 1$.

Since $\theta > 1$, the local equilibrium approximation is expected to hold. This polymer model
corresponds to the flexible polymer in a good solvent. The incorporation of hydrodynamic
interactions may assist the fast relaxation of the rapid internal modes, and changes the nature
of cyclization dynamics from compact to non-compact. The correct scaling law is
predicted to be $\tau_c \sim N^{2.0}$. Since the local equilibrium approximation is correct, the first
passage time approach [4] should give a correct estimate of $\tau_c$ only if the effective potential of
mean force acting on the two ends of the chain is known.
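
The classification of the three models can be checked with exact rational arithmetic. The sketch
below assumes the compactness exponent has the form $\theta = (d + g)/z$ and that the closure time scales
as $N^{\nu(d+g)}$ when $\theta > 1$ and as $N^{z\nu}$ otherwise; the value $g = 5/18$ for chains with excluded volume
is likewise an assumption of this example, and the function and dictionary names are hypothetical.

```python
from fractions import Fraction


def cyclization_exponent(d, g, z, nu):
    """Return (theta, exponent) with tau_c ~ N**exponent.

    theta > 1: local equilibrium holds, exponent = nu * (d + g)   (eq. (24))
    otherwise: compact exploration,     exponent = z * nu         (eq. (25))
    """
    theta = (Fraction(d) + g) / z
    if theta > 1:
        exponent = nu * (d + g)
    else:
        exponent = z * nu
    return theta, exponent


# (d, g, z, nu) for the three models discussed in this appendix.
MODELS = {
    "free-draining Gaussian (Rouse)": (3, Fraction(0), Fraction(4), Fraction(1, 2)),
    "free-draining, excluded volume": (3, Fraction(5, 18), Fraction(11, 3), Fraction(3, 5)),
    "excluded volume + hydrodynamics": (3, Fraction(5, 18), Fraction(3), Fraction(3, 5)),
}

for name, params in MODELS.items():
    theta, exponent = cyclization_exponent(*params)
    print(f"{name}: theta = {theta}, tau_c ~ N^{float(exponent):.2f}")
```

Under these assumptions the three exponents come out as 2, 11/5, and 59/30, the last of which
rounds to the value 2.0 quoted for the good-solvent chain with hydrodynamic interactions.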

6 Appendix B



The relation between the mean first passage time $\tau$ and the probability $\Sigma(t)$ that at time $t$ the
system is still unreacted is exact:

$$\tau = \int_0^\infty \Sigma(t)\,dt, \qquad (26)$$

for any form of $\Sigma(t)$ for which $\Sigma(0)$ is finite and $\lim_{t \to \infty} t\Sigma(t) = 0$. Therefore, the stricter requirement
that $\Sigma(t) \sim \exp(-t/\tau)$ in the original SSS paper [4] is not necessary.

We define $F(t)$ as the flux (or density) of passage: $F(t) = -d\Sigma(t)/dt$. The mean first passage time is:

$$\tau = \int_0^\infty t F(t)\,dt = -\int_0^\infty t\,\frac{d\Sigma(t)}{dt}\,dt = -\int_0^\infty t\,d\Sigma(t). \qquad (27)$$

Performing integration by parts gives:

$$\tau = -t\Sigma(t)\Big|_0^\infty + \int_0^\infty \Sigma(t)\,dt. \qquad (28)$$

By definition, $\Sigma(t)$ must be finite, and hence $t\Sigma(t) = 0$ at $t = 0$. If $\Sigma(t)$ is such that it vanishes as
$t \to \infty$ faster than $t^{-1}$, then the first term in eq. (28) vanishes and we are left with eq. (26). Note
that these are also necessary and sufficient conditions for $\tau$ in eq. (26) to be finite.
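
A short worked example may help show that an exponential survival probability is not required.
The power-law form below, with an arbitrary time constant $t_0$, is chosen purely for illustration and
is not a model used in this work; it decays faster than $t^{-1}$, so both routes to $\tau$ agree.

```latex
% Illustrative survival probability (an assumption of this example):
\Sigma(t) = \left(1 + \frac{t}{t_0}\right)^{-3},
\qquad \Sigma(0) = 1,
\qquad t\,\Sigma(t) \to 0 \ \text{as}\ t \to \infty .

% Route 1: integrate the survival probability directly.
\tau = \int_0^\infty \left(1 + \frac{t}{t_0}\right)^{-3} dt
     = t_0 \left[ -\frac{1}{2(1+x)^{2}} \right]_0^\infty
     = \frac{t_0}{2}, \qquad x = t/t_0 .

% Route 2: first moment of the flux F(t) = -d\Sigma/dt.
F(t) = \frac{3}{t_0}\left(1 + \frac{t}{t_0}\right)^{-4},
\qquad
\int_0^\infty t\,F(t)\,dt
  = 3 t_0 \int_0^\infty \left[(1+x)^{-3} - (1+x)^{-4}\right] dx
  = 3 t_0 \left(\frac{1}{2} - \frac{1}{3}\right)
  = \frac{t_0}{2} .
```

Both integrals give $t_0/2$, even though $\Sigma(t)$ is far from a single exponential.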



7 Appendix C 

In formulating the fluctuations of the end-to-end distance vector, $\langle \delta R_{ee}^2 \rangle$, it is important to take
into account the failings of the continuum model of the Freely Jointed Chain. A simple calculation
of $\langle \delta R_{ee}^2(t) \rangle$ with $\mathbf{R}_{ee}(t) = \mathbf{r}(N, t) - \mathbf{r}(0, t)$ as determined from eq. (1) gives

$$\langle \delta R_{ee}^2(t) \rangle = \frac{16 N b^2}{\pi^2} \sum_{n\ \mathrm{odd}} \frac{1}{n^2} \left( 1 - e^{-n^2 t/\tau_R} \right). \qquad (29)$$

We will refer to this result as the standard analytic average. However, the non-physical boundary
conditions imposed on the continuum representation, with $\partial \mathbf{r}/\partial s = 0$ at the endpoints, will strongly
affect the accuracy of this result.

To minimize the effect of the boundary conditions on averages involving the end-to-end distance,
we compute averages with respect to the differences between the centers of mass of the first and last
bonds, using

$$\mathbf{R}_{ee}(t) \approx \int_{N-1}^{N} ds\,\mathbf{r}(s, t) - \int_0^1 ds\,\mathbf{r}(s, t). \qquad (30)$$

We will refer to this as the center of mass average. Using this representation, $\langle \delta R_{ee}^2(t) \rangle$ is given in
eq. (4).

In Fig. 9, we compare the simulated values of $D(t)$ with those obtained from the two expressions for
$\langle \delta R_{ee}^2(t) \rangle$ (via eq. (5)) for $N = 19$ and $b = 0.39$ nm. In both cases, $b$ is taken as a fitting parameter.
The center of mass average, which fits the data quite well, has a best fit of $b = 0.41$ nm (a difference
of 5%), whereas the standard average does not give accurate results. For this reason, all averages
involving $\mathbf{R}_{ee}$ are computed using the center of mass average.

Figure Captions



Fig. 1: Dependence of $\tau_c$ on $N$ for various values of $a$. The symbols correspond to different
values of the capture radius. (A): The values of $a/b$ are 1.00 (+), 1.23 (×), 1.84 (∗), 2.76 (△), 3.68
(▽), and 5.52 (◇). The lines are obtained using eq. (6) with $n \to \infty$. The diffusion constant in eq.
(6) is obtained using $D = \langle \delta R_{ee}^2(\tau_{ee}/2) \rangle / 3\tau_{ee}$, with $\langle \delta R_{ee}^2(t) \rangle$ given in eq. (10). (B): The values of
$a/b$ are 0.10 (+), 0.25 (×), 0.50 (∗), and 1.00 (△). The lines are the theoretical predictions using
eq. (17). The poor fit using eq. (13) with $a = 0.1b$ (solid line) shows that the two-stage mechanism
has to be included to obtain accurate values of $\tau_c$. The effective exponent $\alpha_\tau$, obtained by fitting
$\tau_c \sim N^{\alpha_\tau}$, is shown in parentheses.

Fig. 2: Sketch of the two-stage mechanism for loop closure for Rouse chains when $a < b$. Although
unphysical, this case is of theoretical interest. In the first stage, fluctuations in $\mathbf{R}_{ee}$ result
in the ends approaching $|\mathbf{R}_{ee}| = b$. The search of the monomers within a volume $b^3$ ($> a^3$), which
is rate limiting, leads to a contact in the second stage.

Fig. 3: (A): Second virial coefficient as a function of $\epsilon_{LJ}$, from eq. (22). The classification of
solvent quality based on the values of $v_2$ is shown. (B): The variation of $\langle R_g^2 \rangle$ with $N$ for different
values of $\epsilon_{LJ}$. The value of $\beta\epsilon_{LJ}$ increases from 0.1 to 1.0 (in the direction of the arrow).

Fig. 4: (A): Loop closure time as a function of $N$ for varying solvent quality. The values of $\beta\epsilon_{LJ}$
increase from 0.1 to 1.0 from top to bottom, as in Fig. 3(A). (B): $\tau_c$ as a function of $\epsilon_{LJ}$, which is
a measure of the solvent quality. The values of $N$ are indicated by the different symbols. (C): Variation of
the scaling exponent $\alpha_\tau$ in $\tau_c \sim N^{\alpha_\tau}$ as a function of $\epsilon_{LJ}$.

Fig. 5: Mechanism of loop closure for a trajectory with a short ($\sim 0.003$ ns) first passage time.
The values of $N$ and $\beta\epsilon_{LJ}$ are 100 and 1.0, respectively. (A): Plots of $|\mathbf{R}_{ee}|$ and $|\mathbf{R}_g|$ (scaled by the
capture radius $a$) as a function of time. The structures of the globules near the initial stage and
upon contact formation between the ends are shown. The end-to-end distance is in red. (B): The
time-dependent changes in the coordination numbers for the first ($Q_1(t)$) and last ($Q_N(t)$) monomers
during the contact formation.

Fig. 6: Same as Fig. 5, except the data are for a trajectory with a first passage time for contact
formation that is about 4.7 ns. (A): Although the values of $|\mathbf{R}_g|$ are approximately constant, $|\mathbf{R}_{ee}|$
fluctuates greatly. (B): Substantial variations in $Q_1(t)$ and $Q_N(t)$ are observed during the looping
dynamics, in which both ends spend a great deal of time on the surface of the globule.

Fig. 7: (A): Distribution of end-to-end distances for a weakly ($\beta\epsilon_{LJ} = 0.4$) and strongly ($\beta\epsilon_{LJ} =
1.0$) hydrophobic chain. (B): Diffusion constant $D_{ee}(t)$ in units of $D_0$ for varying solvent quality.
The diffusion constant is defined using $D_{ee}(t) = \langle \delta R_{ee}^2(t) \rangle / 6t$. The values of $\epsilon_{LJ}$ are shown in the
inset.

Fig. 8: (A): The ratio $r_l = \tau_l/\tau_c$ as a function of interior length $l$. Here $\tau_l$ is the contact formation
time for beads that are separated by $l$ monomers. $r_l$ is non-monotonic for weakly hydrophobic
chains, but decreases monotonically with decreasing $l$ in the poorest solvents. The observed maxima
occur near $l/N \approx 0.9$. (B): For loop length $l = 80$, the ratio $r_l = \tau_{l=80}/\tau_c$ as a function of $\beta\epsilon_{LJ}$ for a
chain with two linkers (each of 20 beads) that are attached to beads 20 and 100. In good solvents,
the interior loop closure kinetics is about 2.5 times slower than the end-to-end one with the same
loop length. In poor solvents, however, there is virtually no difference between the two.

Fig. 9: Measured diffusion coefficient as a function of time for the Rouse chain with $N = 19$
and $b = 0.39$ nm. Symbols are the simulation data, the dashed line (standard average) is obtained
using eqs. (29) and (5) (with best fit $b \approx 0.26$ nm), and the solid line is the center-of-mass average
derived using eqs. (4) and (5) (with best fit $b \approx 0.41$ nm).